Dutch book in simple multivariate normal
prediction: Another look

Morris L. Eaton

University of Minnesota 

Abstract: In this expository paper we describe a relatively elementary method 
of establishing the existence of a Dutch book in a simple multivariate normal 
prediction setting. The method involves deriving a nonstandard predictive dis- 
tribution that is motivated by invariance. This predictive distribution satisfies 
an interesting identity which in turn yields an elementary demonstration of 
the existence of a Dutch book for a variety of possible predictive distributions. 



1. Introduction 

Ordinarily, showing that a popular inferential scheme suffers from de Finetti's
incoherence (existence of a Dutch book) is met with surprise since incoherence is
typically regarded as a serious indictment of a statistical method. An instance of
this incoherence occurs in a simple $p$-dimensional multivariate normal setting. The
statistical problem is to predict the next observation in a sequence of independent
and identically distributed normal observations (with mean 0 and unknown
$p \times p$ positive definite covariance matrix $\Sigma$). The "usual" predictive
distribution is obtained from a formal Bayes calculation using the Jeffreys' invariant
prior distribution for $\Sigma$. This is detailed in Eaton and Sudderth [9], where it is
shown that the "usual" solution is in fact incoherent (a Dutch book can be made
against the "usual" predictive distribution). The arguments in Eaton and Sudderth [9]
are neither simple nor intuitive, but variations on these have been used effectively to
extend the Dutch book argument to other multivariate settings (see Eaton and
Sudderth [10, 11, 12, 13]). Some background information on incoherence and Dutch
book is given in Section 2 below.

Recently, Eaton and Freedman [8] presented a relatively simple and self-contained
argument that the Jeffreys' invariant prior results in a Dutch book in the
simple normal prediction problem. The demonstration relies mainly on sampling
properties of the normal distribution rather than on invariance arguments (see
Eaton and Sudderth [9]). However, the Eaton and Freedman [8] arguments seem
to be rather special and not easily adaptable to other invariant proposals such as
those discussed in Bjørnstad [2].

The focus of this paper is again the simple normal prediction problem of Eaton
and Sudderth [9] and Eaton and Freedman [8]. The purpose of writing the current
paper is to present an argument that:

(i) reveals the role of the invariance in the incoherence (existence of a Dutch
book);

(ii) relies primarily on calculus so is mainly self-contained; 

(iii) yields the Dutch book conclusion for many predictive proposals in the simple 
normal prediction problem. 

To put things into perspective it is useful to sketch the argument used below. As
in Eaton and Sudderth [9], let $G_T^+$ be the group of $p \times p$ lower triangular
matrices with positive diagonal elements. Then the unknown covariance matrix $\Sigma$
in the normal model can be uniquely expressed as $\Sigma = \theta\theta'$, $\theta \in G_T^+$.
Using the right Haar measure on $G_T^+$ as an improper prior distribution (rather than
the Jeffreys' prior on $\Sigma$), a formal Bayes calculation yields a predictive
distribution that we denote by $Q_H$. The predictive distribution $Q_H$ is invariant
under the group $G_T^+$. Details and a full explanation are given in Section 3.

The remainder of the paper is devoted to an argument that allows one to compare
almost any $G_T^+$-invariant predictive distribution $Q$ with the special predictive
distribution $Q_H$ above. The argument is based on recent work of Zhu [21] and on
a density based approach of Eaton and Sudderth [13]. In essence, the constructive
method described in Sections 4 and 5 shows that if an invariant predictive
distribution $Q$ has a density $q$, and if $q$ differs on a set of positive Lebesgue
measure from the density of $Q_H$, then $Q$ is incoherent (a Dutch book exists). This
result is used to show a Dutch book exists for several well-known predictive proposals.

Finally, we should mention that $Q_H$ is not incoherent because the group $G_T^+$ is
amenable. See Eaton and Sudderth [If], Theorem 8.1, for a general discussion and
a proof.

2. Background 

This paper focuses on a method for evaluating predictive distributions in a
simple multivariate normal sampling situation. However, it is useful to
first describe what we mean by a prediction problem and to detail the evaluative
criterion of interest here.

The origin of our formulation of the prediction problem stems from Laplace [17]
(see Stigler [19] for an English translation of the original French). Consider a sample
space $(\mathcal{X}, \mathcal{B}_1)$ and an observation (usually a vector or a matrix) $X$
in $\mathcal{X}$. A variable $Z$ taking values in $\mathcal{Z}$ is to be predicted on the
basis of an assumed joint parametric probability model

$P(dx, dz \mid \theta), \quad \theta \in \Theta.$

Here $\Theta$ is a parameter space and $\theta$ is an unknown parameter. By a predictive
distribution we mean a probability distribution $Q(dz \mid x)$ for $Z$ that is allowed to
depend on the observed value of $X = x$. The primary example of concern throughout
this paper is the following.

Example 1. Let $X_1, \ldots, X_n$ be independent and identically distributed (iid) with a
$p$-dimensional multivariate normal distribution $N_p(0, \Sigma)$ with mean 0 and unknown
$p \times p$ positive definite covariance matrix $\Sigma$. It is assumed that $n \ge p$ and
$p \ge 2$. The sample space $\mathcal{X}$ is the set of all $p \times n$ matrices with rank $p$
(a set of Lebesgue measure zero has been discarded, for convenience). Thus, the matrix



$X = (X_1, \ldots, X_n) : p \times n$

is the "data point" in $\mathcal{X}$ and

(2.1)  $S = XX' = \sum_{i=1}^{n} X_i X_i' : p \times p$

has rank $p$. The variable $Z$ to be predicted is assumed to be $N_p(0, \Sigma)$ and
independent of the $X$'s. One naive predictive distribution for $Z$ is

(2.2)  $Q_1(\cdot \mid x) = N_p(0, n^{-1}s)$,

where $x$ is the observed value of $X$ and $s$ is the observed value of $S$. Of course, the
intuition is that $n^{-1}S$ is an unbiased estimator of $\Sigma$ based on a minimal sufficient
statistic. An alternative proposal, based on a formal Bayes argument and the so-called
Jeffreys' improper prior distribution (see Eaton and Freedman [8] for some
discussion and details), yields

(2.3)  $Q_2(dz \mid x) = q_2(z \mid x)\,dz$,

where $dz$ is Lebesgue measure on $\mathcal{Z} = R^p$ and the density $q_2(\cdot \mid x)$ is

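(2.4)  $q_2(z \mid x) = \dfrac{C_{n,p}}{|s|^{1/2}\,(1 + z's^{-1}z)^{(n+1)/2}}$.

The constant $C_{n,p}$ is

(2.5)  $C_{n,p} = \dfrac{\Gamma\left(\frac{n+1}{2}\right)}{\pi^{p/2}\,\Gamma\left(\frac{n-p+1}{2}\right)}$

and $|s|$ is the determinant of $s$. This ends the introduction of Example 1.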

We now return to the general prediction setting at the beginning of this section. 
Stone [20] introduced the notion of strong inconsistency as a criterion for excluding
certain types of probability distributions in inferential settings. In the prediction 
framework, this idea takes the following form. 

Definition 2.1. A predictive distribution $Q(dz \mid x)$ is strongly inconsistent (SI)
with the model $\{P(dx, dz \mid \theta) \mid \theta \in \Theta\}$ if there exists a measurable
function $f(x, z)$ with values in $[-1, 1]$ and an $\epsilon > 0$ so that

(2.6)  $\sup_x \int f(x, z)\,Q(dz \mid x) + \epsilon \le \inf_\theta \iint f(x, z)\,P(dx, dz \mid \theta)$.



The intuition behind SI is that when (2.6) holds, no matter what the distribution
for $X$, say $m(dx)$,

$\iint f(x, z)\,Q(dz \mid x)\,m(dx) + \epsilon \le \iint f(x, z)\,P(dx, dz \mid \theta)$,

for all $\theta$. Thus, if $Q(dz \mid x)$ is used as a distribution for $Z$ after seeing $X = x$,
then under all possible models for $(X, Z)$ that are consistent with $Q(dz \mid x)$, the
expectation of $f$ is at least $\epsilon$ less than any expectation of $f$ under the assumed
model. Hence the terminology, strong inconsistency.

Ramsey [18] and independently de Finetti [3, 4] introduced the notion of betting
schemes for the evaluation of proposed probability distributions. Their ideas were
extended by Freedman and Purves [14] to cover cases with conditional bets. A
somewhat modified approach, due to Heath and Sudderth [15], is relevant to the
discussion below. In the prediction case when the spaces are infinite (as in Example
1), the Ramsey-de Finetti-Freedman and Purves scheme takes the following form.
Let $C$ be a subset of $\mathcal{X} \times \mathcal{Z}$ and let $C_x = \{z \mid (x, z) \in C\}$ be the
$x$-section of $C$. If an inferrer is using $Q(dz \mid x)$ as a predictive distribution and
$X = x$ is observed, then $C_x \subset \mathcal{Z}$ is assigned the probability $Q(C_x \mid x)$.
Therefore

$\Psi_0(x, z) = I_C(x, z) - Q(C_x \mid x)$

has $Q(\cdot \mid x)$-expectation zero. A standard interpretation of $\Psi_0$ as a payoff
function is:

After seeing $X = x$, the inferrer gives $C_x$ probability $Q(C_x \mid x)$. A gambler can pay
$Q(C_x \mid x)$ dollars for a ticket worth

$\begin{cases} \$1 & \text{if } z \in C_x, \\ 0 & \text{if } z \notin C_x. \end{cases}$

Obviously, the net payoff to the gambler is $\Psi_0(x, z)$. The inferrer regards the bet as
"fair" since $\Psi_0$ has expectation zero under $Q(\cdot \mid x)$.

Slightly more complicated betting scenarios are constructed as follows. Suppose the
gambler can pick subsets $C_1, \ldots, C_r$ in $\mathcal{X} \times \mathcal{Z}$ and pays
$c_i(x)\,Q(C_{i,x} \mid x)$ dollars for a ticket worth

$\begin{cases} \$c_i(x) & \text{if } z \in C_{i,x}, \\ 0 & \text{if } z \notin C_{i,x}, \end{cases}$

for $i = 1, \ldots, r$. The numbers $c_i(x)$ are assumed to be bounded functions of $x$, but
need not be non-negative. The net payoff to the gambler is computed by summing
the individual payoffs, so the net payoff function is

(2.7)  $\Psi(x, z) = \sum_{i=1}^{r} c_i(x)\,[I_{C_i}(x, z) - Q(C_{i,x} \mid x)]$.

Again, since $\Psi(x, \cdot)$ has $Q(\cdot \mid x)$ expectation 0, the inferrer regards the
betting scheme of the gambler as fair.
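
The bookkeeping behind (2.7) can be illustrated on a finite toy space. The sketch below is only an illustration under assumptions that are not in the paper: $Z$ takes three hypothetical values, $x$ is held fixed, and the names `net_payoff` and `expected_payoff` are invented for this example. It checks that the net payoff has expectation zero under $Q(\cdot \mid x)$, which is why the inferrer regards the scheme as fair.

```python
from fractions import Fraction


def net_payoff(stakes, events, q, z):
    """Net payoff (2.7) at outcome z for a fixed x.

    stakes: the numbers c_i(x); events: the sections C_{i,x} as sets;
    q: dict mapping each outcome to its probability Q({z} | x).
    """
    total = Fraction(0)
    for c, event in zip(stakes, events):
        prob = sum(q[v] for v in event)       # Q(C_{i,x} | x)
        indicator = 1 if z in event else 0    # I_{C_i}(x, z)
        total += c * (indicator - prob)
    return total


def expected_payoff(stakes, events, q):
    """Expectation of the net payoff under Q(. | x)."""
    return sum(q[z] * net_payoff(stakes, events, q, z) for z in q)


# Hypothetical toy predictive distribution on three outcomes.
q = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
stakes = [Fraction(2), Fraction(-1)]   # stakes may be negative
events = [{"a"}, {"b", "c"}]

print(net_payoff(stakes, events, q, "a"))    # 3/2
print(expected_payoff(stakes, events, q))    # 0
```

Incoherence in the sense discussed next is a statement about the expectation of the same payoff under the sampling model rather than under $Q(\cdot \mid x)$.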

Here is de Finetti's notion of incoherence adapted to the current prediction 
setting. 

Definition 2.2. The predictive distribution $Q(\cdot \mid x)$ is incoherent if there is an
$\epsilon > 0$ and a payoff function $\Psi$ of the form (2.7) such that

(2.8)  $E_\theta \Psi(X, Z) \ge \epsilon$ for all $\theta \in \Theta$.

In other words, the predictive distribution is incoherent if the gambler has a
uniformly (over $\theta$) positive expected gain under the model. The discussion leading
to Definition 2.2 comes from Heath and Sudderth [15]. The Freedman and Purves
[14] formulation was in terms of odds rather than payoff functions.

In examples, the verification of SI or incoherence is typically not straightforward. 
However, as shown recently in Eaton and Freedman [8], in the prediction setting 
of this paper, the two notions are equivalent (see Theorem 2 on pp. 868-869). For
the example under consideration here, SI is established for a variety of predictive 
distributions (see Section 5). 

Finally, the term "Dutch book" is used as a synonym for incoherence. When 
incoherence obtains, standard terminology is to say that a "Dutch book" can be made
against the predictive distribution $Q(\cdot \mid x)$. See Eaton and Freedman [8] for some
discussion and a bit of history.

3. The Haar predictive distribution

As in Example 1, consider $X_1, \ldots, X_n$ iid $N_p(0, \Sigma)$ where $\Sigma$ is an unknown
$p \times p$ positive definite matrix. It is assumed that $n \ge p$ so the matrix $S$ in (2.1)
is positive definite when the sample matrix

$X = (X_1, \ldots, X_n) : p \times n$

is an element of $\mathcal{X}$. A variable $Z \in R^p = \mathcal{Z}$ which is $N_p(0, \Sigma)$ is to
be predicted after seeing $X = x \in \mathcal{X}$. Here $Z$ is independent of $X_1, \ldots, X_n$.
In short, the statistical problem of concern in this paper is to produce a predictive
distribution $Q(dz \mid x)$ for $Z \in R^p$ after seeing the data $X = x$. The focus of this
section is on a particular predictive distribution obtained via a formal Bayes calculation.

Recall that $G_T^+$ is the group of $p \times p$ lower triangular matrices with positive
diagonal elements. For use in what follows, we list some facts about $G_T^+$ (see Eaton
[7], especially Chapter 1 and in particular, pages 18 and 19). Elements $g \in G_T^+$ have
positive diagonals $g_{ii}$, $i = 1, \ldots, p$, and $g_{ij} = 0$ for $i < j$. The symbol "$dg$"
denotes Lebesgue measure on $G_T^+$ (as an obvious open subset of $R^{p(p+1)/2}$). The measure

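(3.1)  $\nu_r(dg) = \dfrac{dg}{\prod_{i=1}^{p} g_{ii}^{\,p-i+1}}$

is a right invariant (Haar) measure on $G_T^+$. The function

(3.2)  $\Delta(g) = \prod_{i=1}^{p} g_{ii}^{\,p-2i+1}$

is the modular function of $G_T^+$ and

(3.3)  $\nu_l(dg) = \Delta(g)\,\nu_r(dg) = \dfrac{dg}{\prod_{i=1}^{p} g_{ii}^{\,i}}$

is a left invariant measure on $G_T^+$.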
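
As a small numerical sanity check of the relation between the right Haar density, the modular function and the left Haar density, the following sketch evaluates all three at one group element. It assumes the forms $\nu_r(dg) = dg / \prod_i g_{ii}^{p-i+1}$, $\Delta(g) = \prod_i g_{ii}^{p-2i+1}$ and $\nu_l(dg) = dg / \prod_i g_{ii}^{i}$ for (3.1) to (3.3); the function names are hypothetical. Only the diagonal of $g$ enters these formulas, so an element is represented here by its diagonal.

```python
import math


def right_haar_density(diag):
    """Density of nu_r with respect to dg: 1 / prod_i g_ii^(p - i + 1)."""
    p = len(diag)
    return 1.0 / math.prod(d ** (p - i + 1) for i, d in enumerate(diag, start=1))


def modular(diag):
    """Modular function: prod_i g_ii^(p - 2i + 1)."""
    p = len(diag)
    return math.prod(d ** (p - 2 * i + 1) for i, d in enumerate(diag, start=1))


def left_haar_density(diag):
    """Density of nu_l with respect to dg: 1 / prod_i g_ii^i."""
    return 1.0 / math.prod(d ** i for i, d in enumerate(diag, start=1))


# Hypothetical diagonal of a 3 x 3 lower triangular matrix.
diag = [0.5, 2.0, 3.0]
lhs = modular(diag) * right_haar_density(diag)
rhs = left_haar_density(diag)
print(lhs, rhs)  # the two values agree up to rounding
```

Because the diagonal of a product of lower triangular matrices is the product of the diagonals, the same representation also shows that the assumed $\Delta$ is multiplicative.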

Given a $p \times p$ positive definite matrix $\Sigma$, there is a unique element $T \in G_T^+$
such that $\Sigma = TT'$ (see Eaton [6], Proposition 5.4 for a proof). This element is denoted
by $\tau(\Sigma)$ in some expressions below. In particular, $\theta = \tau(\Sigma)$ is a
reparameterization of covariance matrices. In this parameterization, the density function
of $X$, with respect to Lebesgue measure on $\mathcal{X}$, is

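(3.4)  $f_1(x \mid \theta) = \dfrac{1}{(2\pi)^{np/2}\,|\theta|^{n}} \exp\left\{-\tfrac{1}{2}\,\mathrm{tr}\,(\theta\theta')^{-1}s\right\}, \quad x \in \mathcal{X}$,

where $s = xx' : p \times p$, $\theta \in G_T^+$, and "tr" denotes trace. Of course the density
function of $Z$ on $R^p$ is

(3.5)  $f_2(z \mid \theta) = \dfrac{1}{(2\pi)^{p/2}\,|\theta|} \exp\left\{-\tfrac{1}{2}\,\mathrm{tr}\,(\theta\theta')^{-1}zz'\right\}$.

As usual, $|\cdot|$ denotes the determinant. These two densities define the probability
models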



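(3.6)  $P_1(dx \mid \theta), \quad \theta \in \Theta = G_T^+, \quad x \in \mathcal{X}$

and

(3.7)  $P_2(dz \mid \theta), \quad \theta \in \Theta, \quad z \in R^p$,

for $X$ and $Z$ respectively. Recall $X$ and $Z$ are assumed independent so the joint
model for $(X, Z)$ is

(3.8)  $P(dx, dz \mid \theta) = P_1(dx \mid \theta)\,P_2(dz \mid \theta)$

and the joint density is

(3.9)  $f(x, z \mid \theta) = f_1(x \mid \theta)\,f_2(z \mid \theta), \quad (x, z) \in \mathcal{X} \times \mathcal{Z}$.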

For $g \in G_T^+$, it is easy to show that

(3.10)  $f_1(gx \mid g\theta) = |g|^{-n} f_1(x \mid \theta)$

and

(3.11)  $f_2(gz \mid g\theta) = |g|^{-1} f_2(z \mid \theta)$.

These two invariance properties of the densities imply that for the model (3.8),

(3.12)  $P(gB \mid g\theta) = P(B \mid \theta)$,

for all Borel sets $B \subset \mathcal{X} \times \mathcal{Z}$ and all $\theta, g \in G_T^+$. In other
words, the assumed statistical model is invariant. Note that the group $G_T^+$ acts
transitively on $\Theta = G_T^+$. In such invariant settings, it was argued in Eaton and
Sudderth [13] that the use of the right Haar measure as a prior distribution may yield
predictive distributions with interesting inferential properties.
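
The invariance identities (3.10) and (3.11) can be checked numerically. The following is a minimal sketch, assuming the mean-zero normal densities of (3.4) and (3.5) and using plain Python lists for matrices; the helper names and the particular values of $g$, $\theta$, $x$ and $z$ are hypothetical. It uses the fact that $\mathrm{tr}\,(\theta\theta')^{-1}xx'$ is the sum of squared entries of $\theta^{-1}x$.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def solve_lower(t, b):
    """Solve t y = b by forward substitution (t lower triangular, b is p x m)."""
    p, m = len(t), len(b[0])
    y = [[0.0] * m for _ in range(p)]
    for i in range(p):
        for j in range(m):
            acc = b[i][j] - sum(t[i][k] * y[k][j] for k in range(i))
            y[i][j] = acc / t[i][i]
    return y


def det_lower(t):
    """Determinant of a lower triangular matrix."""
    return math.prod(t[i][i] for i in range(len(t)))


def normal_density(x, theta):
    """Density of a p x m matrix whose columns are iid N_p(0, theta theta')."""
    p, m = len(x), len(x[0])
    y = solve_lower(theta, x)                     # theta^{-1} x
    quad = sum(v * v for row in y for v in row)   # tr (theta theta')^{-1} x x'
    norm = (2 * math.pi) ** (m * p / 2) * det_lower(theta) ** m
    return math.exp(-0.5 * quad) / norm


# Hypothetical values with p = 2 and n = 3.
theta = [[1.5, 0.0], [0.4, 0.8]]
g = [[2.0, 0.0], [-0.7, 0.5]]
x = [[0.3, -1.1, 0.6], [1.2, 0.2, -0.4]]
z = [[0.9], [-0.2]]
n = 3

lhs1 = normal_density(matmul(g, x), matmul(g, theta))   # f_1(gx | g theta)
rhs1 = det_lower(g) ** (-n) * normal_density(x, theta)  # |g|^{-n} f_1(x | theta)
lhs2 = normal_density(matmul(g, z), matmul(g, theta))   # f_2(gz | g theta)
rhs2 = det_lower(g) ** (-1) * normal_density(z, theta)  # |g|^{-1} f_2(z | theta)
print(lhs1, rhs1)
print(lhs2, rhs2)
```

The same function serves for both densities because $f_2$ is the case of a single column.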

We now proceed to calculate, via a formal use of Bayes Theorem, the predictive
distribution (henceforth called the Haar inference) induced by using $\nu_r(d\theta)$ in (3.1)
as an improper prior. To this end, let

(3.13)  $m(x, z) = \int_{G_T^+} f(x, z \mid \theta)\,\nu_r(d\theta)$

and

(3.14)  $m_1(x) = \int_{G_T^+} f_1(x \mid \theta)\,\nu_r(d\theta)$,

for $x \in \mathcal{X}$ and $z \in R^p$. For fixed $x$,

(3.15)  $q_H(z \mid x) = m(x, z)/m_1(x)$

is a density on $R^p$ and by definition, the predictive distribution $Q_H(dz \mid x)$ with
density (3.15) is the Haar inference. Using (3.10), (3.11), and the properties of $\nu_r$,
it is easy to show that $q_H$ is invariant in the sense that

(3.16)  $q_H(gz \mid gx) = |g|^{-1} q_H(z \mid x)$.

From (3.16), the invariance of $Q_H$,

(3.17)  $Q_H(gB \mid gx) = Q_H(B \mid x)$,

is immediate.

Because the calculation is not quite standard, we sketch the details in the derivation
of $q_H$ in (3.15). For $w \in R^p$, define the function $\psi_p(w)$ by

(3.18)  $\psi_p(w) = \begin{cases} 1 & \text{if } p = 1, \\ (1 + w'w)^{-(p-1)/2} \prod_{i=1}^{p-1} (1 + w_1^2 + \cdots + w_i^2) & \text{if } p \ge 2. \end{cases}$

Lemma 3.1. For $w \in R^p$, recall $\tau(I_p + ww')$ is the unique element in $G_T^+$ that
satisfies $(I_p + ww') = [\tau(I_p + ww')][\tau(I_p + ww')]'$. Then,

(3.19)  $\Delta(\tau(I_p + ww')) = \psi_p(w)$

where $\Delta$ is the modular function given in (3.2).

Proof. The proof of this (a messy calculation) is not too hard via an induction
argument on dimension $p$. The details are omitted. □
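
Since the proof is omitted, a numerical check of (3.19) may be reassuring. The sketch below assumes $\Delta(g) = \prod_i g_{ii}^{p-2i+1}$ for (3.2) and $\psi_p(w) = (1+w'w)^{-(p-1)/2}\prod_{i=1}^{p-1}(1+w_1^2+\cdots+w_i^2)$ for $p \ge 2$; the function names and the test vector are hypothetical. It relies on the standard fact that the squared diagonal entries of a Cholesky factor are ratios of successive leading principal minors, which for $I_p + ww'$ equal $1 + w_1^2 + \cdots + w_i^2$.

```python
import math


def cholesky_lower(a):
    """Lower triangular T with positive diagonal such that T T' = a."""
    p = len(a)
    t = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = a[i][j] - sum(t[i][k] * t[j][k] for k in range(j))
            t[i][j] = math.sqrt(acc) if i == j else acc / t[j][j]
    return t


def modular(t):
    """Assumed modular function: prod_i t_ii^(p - 2i + 1), with i starting at 1."""
    p = len(t)
    return math.prod(t[i][i] ** (p - 2 * (i + 1) + 1) for i in range(p))


def psi(w):
    """Assumed psi_p(w) from (3.18)."""
    p = len(w)
    if p == 1:
        return 1.0
    total = 1.0 + sum(v * v for v in w)
    partial, prod = 1.0, 1.0
    for v in w[:-1]:              # partial sums 1 + w_1^2 + ... + w_i^2, i < p
        partial += v * v
        prod *= partial
    return total ** (-(p - 1) / 2) * prod


def identity_plus_outer(w):
    """The matrix I_p + w w'."""
    p = len(w)
    return [[(1.0 if i == j else 0.0) + w[i] * w[j] for j in range(p)]
            for i in range(p)]


w = [0.3, -1.2, 0.7, 2.0]   # hypothetical point in R^4
t = cholesky_lower(identity_plus_outer(w))
print(modular(t), psi(w))   # the two values agree up to rounding
```

For $p = 2$ the identity can also be read off by hand: $\tau_{11} = (1 + w_1^2)^{1/2}$ and $\tau_{22} = ((1 + w'w)/(1 + w_1^2))^{1/2}$, so $\Delta = \tau_{11}/\tau_{22} = (1 + w_1^2)/(1 + w'w)^{1/2}$.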

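Theorem 3.1. Let $k_0$ be the density on $R^p$ given by (see (2.4))

(3.20)  $k_0(w) = \dfrac{\Gamma\left(\frac{n+1}{2}\right)}{\pi^{p/2}\,\Gamma\left(\frac{n-p+1}{2}\right)} \cdot \dfrac{1}{(1 + w'w)^{(n+1)/2}}$.

Then

(3.21)  $k_1(w) = k_0(w)\,\dfrac{1}{\psi_p(w)}$

is a density on $R^p$ and

(3.22)  $q_H(z \mid x) = |L|^{-1} k_1(L^{-1}z)$

where $xx' = s = LL'$ with $L \in G_T^+$.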

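Proof. Using the expressions (3.4) and (3.5) and the fact that $\nu_r$ transforms to $\nu_l$
under the mapping $\theta \to \theta^{-1}$ in $G_T^+$, it follows from (3.15) that

(3.23)  $q_H(z \mid x) = (2\pi)^{-p/2}\, \dfrac{\int |\theta|^{n+1} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta(s + zz')\}\,\nu_l(d\theta)}{\int |\theta|^{n} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta s\}\,\nu_l(d\theta)}$.

Writing $s = LL'$ with $L \in G_T^+$ and setting $w = L^{-1}z$, some algebra and a change
of variable yields

$q_H(z \mid x) = (2\pi)^{-p/2}\, \dfrac{\int |\theta|^{n+1} \exp\{-\frac{1}{2}\mathrm{tr}\,(\theta L)'(\theta L)(I_p + ww')\}\,\nu_l(d\theta)}{\int |\theta|^{n} \exp\{-\frac{1}{2}\mathrm{tr}\,(\theta L)'(\theta L)\}\,\nu_l(d\theta)}$

$\qquad = |L|^{-1}(2\pi)^{-p/2}\, \dfrac{\int |\theta|^{n+1} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta(I_p + ww')\}\,\nu_l(d\theta)}{\int |\theta|^{n} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta\}\,\nu_l(d\theta)}$.

Setting $I_p + ww' = UU'$, $U \in G_T^+$, and changing variables in the numerator integral
above gives

(3.24)  $q_H(z \mid x) = \dfrac{|L|^{-1}|U|^{-(n+1)}}{(2\pi)^{p/2}\Delta(U)} \cdot \dfrac{\int |\theta|^{n+1} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta\}\,\nu_l(d\theta)}{\int |\theta|^{n} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta\}\,\nu_l(d\theta)}$.

Now, note that

$|U|^{-(n+1)} = \dfrac{1}{(1 + w'w)^{(n+1)/2}}$

and from Lemma 3.1,

$\Delta(U) = \psi_p(w)$.

Finally, a standard (but not routine) multivariate calculation yields

$\dfrac{1}{(2\pi)^{p/2}} \cdot \dfrac{\int |\theta|^{n+1} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta\}\,\nu_l(d\theta)}{\int |\theta|^{n} \exp\{-\frac{1}{2}\mathrm{tr}\,\theta'\theta\}\,\nu_l(d\theta)} = \dfrac{\Gamma\left(\frac{n+1}{2}\right)}{\pi^{p/2}\,\Gamma\left(\frac{n-p+1}{2}\right)}$.

Piecing all of this together yields the expression (3.22) for $q_H(z \mid x)$. The fact that
$k_1$ is a density on $R^p$ follows by setting $L = I_p$ in (3.22). □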
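
For $p = 2$ the claim that $k_0$ and $k_1$ are densities can be checked by direct numerical integration. This is a rough sketch, not part of the proof: it assumes $k_0(w) = C_{n,2}(1+w'w)^{-(n+1)/2}$ with $C_{n,2} = \Gamma(\frac{n+1}{2})/(\pi\,\Gamma(\frac{n-1}{2}))$, $\psi_2(w) = (1+w_1^2)/(1+w'w)^{1/2}$ and $k_1 = k_0/\psi_2$, and it uses a midpoint rule after the substitution $w_i = \tan t_i$. The value $n = 5$ and all function names are hypothetical choices.

```python
import math


def k0(w1, w2, n):
    """Assumed k_0 for p = 2: C_{n,2} (1 + w'w)^(-(n+1)/2)."""
    c = math.gamma((n + 1) / 2) / (math.pi * math.gamma((n - 1) / 2))
    return c * (1.0 + w1 * w1 + w2 * w2) ** (-(n + 1) / 2)


def psi2(w1, w2):
    """Assumed psi_2(w) = (1 + w_1^2) / (1 + w'w)^(1/2)."""
    return (1.0 + w1 * w1) / math.sqrt(1.0 + w1 * w1 + w2 * w2)


def k1(w1, w2, n):
    """Assumed k_1 = k_0 / psi_2."""
    return k0(w1, w2, n) / psi2(w1, w2)


def integrate_plane(f, m=400):
    """Midpoint rule on R^2 after the substitution w_i = tan(t_i)."""
    h = math.pi / m
    nodes = [-math.pi / 2 + (j + 0.5) * h for j in range(m)]
    pts = [(math.tan(t), 1.0 / math.cos(t) ** 2) for t in nodes]  # (w, dw/dt)
    return h * h * sum(f(a, b) * ja * jb for a, ja in pts for b, jb in pts)


n = 5  # hypothetical sample size
mass_k0 = integrate_plane(lambda a, b: k0(a, b, n))
mass_k1 = integrate_plane(lambda a, b: k1(a, b, n))
print(mass_k0, mass_k1)  # both should be close to 1
```

The tangent substitution maps the plane to a bounded square on which both transformed integrands stay bounded for $n > 2$, so a simple midpoint rule is adequate for this check.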

Let $q(z \mid x) \ge 0$ be an arbitrary predictive density for $Z$ with $X = x$. That is,
$q(\cdot \mid x)$ is a density on $R^p$ for each $x \in \mathcal{X}$ and $q$ is jointly measurable
on $\mathcal{Z} \times \mathcal{X}$.

Definition 3.1. The density $q(z \mid x)$ is $G_T^+$-invariant if for all $g \in G_T^+$,

(3.25)  $q(gz \mid gx) = |g|^{-1} q(z \mid x)$ for all $z, x$.

Each $G_T^+$-invariant predictive density yields a $G_T^+$-invariant predictive
distribution $Q(dz \mid x)$ given by

$Q(B \mid x) = \int_B q(z \mid x)\,dz, \quad B \subset R^p$.

The invariance of $Q$, namely $Q(gB \mid gx) = Q(B \mid x)$, is immediate from (3.25).

The densities of $Q_1$ in (2.2) and $Q_2$ in (2.3) are both $G_T^+$-invariant. Further, when
specialized to the case of mean zero considered in this paper, the first seven entries
in Table 1 of the survey paper of Keyes and Levy [16] are all $G_T^+$-invariant. It is
such predictive densities that are compared to $q_H$ in Section 5.



4. Zhu's result 

Using some results of Eaton and Sudderth [12], Zhu [21] was able to establish an 
interesting and useful relationship between an invariant prediction model and the 
predictive distribution obtained from the right Haar measure. Zhu's result will be 
stated here only for the normal model under consideration. For a simplified version 
and proof of this general result when densities exist, see Eaton and Sudderth [13].
The most general version is Theorem 3.4.1 in Zhu [21]. 

In the notation of Section 3, let $P_1(dx \mid \theta)$, $P_2(dz \mid \theta)$ and
$P(dx, dz \mid \theta)$ be the probability measures in (3.6), (3.7) and (3.8) respectively.
Also, let $Q_H(dz \mid x)$ be the Haar inference defined by the density $q_H(z \mid x)$ in (3.15).
Next, introduce the Haar model given by the probability measure

(4.1)  $P_H(dx, dz \mid \theta) = Q_H(dz \mid x)\,P_1(dx \mid \theta)$

on $\mathcal{X} \times \mathcal{Z}$.

Recall that a real valued function $f$ on $\mathcal{X} \times \mathcal{Z}$ is $G_T^+$-invariant if
for all $g \in G_T^+$

$f(gx, gz) = f(x, z)$ for all $x, z$.

Theorem 4.1. The original model $P(dx, dz \mid \theta)$ and the Haar model
$P_H(dx, dz \mid \theta)$ agree on the bounded invariant functions. That is, for each bounded
measurable $G_T^+$-invariant $f$ and for all $\theta$,

(4.2)  $\iint f(x, z)\,P(dx, dz \mid \theta) = \iint f(x, z)\,P_H(dx, dz \mid \theta)$.

The implications of (4.2) in more general settings are discussed in Eaton and
Sudderth [13]. Because densities exist in the setting of this paper, Theorem 3.1 in
Eaton and Sudderth [13] applies directly to this case so we omit a proof.

Remark 1. In the proof of Theorem 3.1 in Eaton and Sudderth [13], there are
two misprints: On page 500 of that paper in lines 8 and 9 from the top,
"$p(x, z \mid g\theta)$" should be "$p(x, z \mid \theta)$".

The use of (4.2) in this paper occurs in the next section which deals with the 
Dutch book argument. 

5. Dutch book 

Consider a predictive density $q(z \mid x)$ that is $G_T^+$-invariant. The purpose of this
section is to show that if $q(z \mid x)$ is essentially different from $q_H(z \mid x)$, then the
predictive distribution $Q(dz \mid x)$ determined by $q$ is incoherent so a Dutch book
exists. The construction of the required payoff function is explicit and both (2.6) and
(2.8) are verified directly.

To make the above precise, consider the set

(5.1)  $C = \{(x, z) \mid q(z \mid x) < q_H(z \mid x)\} \subset \mathcal{X} \times \mathcal{Z}$.

Then

(5.2)  $C_x = \{z \mid (x, z) \in C\}$

is the $x$-section of the set $C$. Let $l$ denote Lebesgue measure on
$\mathcal{X} \times \mathcal{Z}$, $l_1$ denote Lebesgue measure on $\mathcal{X}$ and $l_2$ denote
Lebesgue measure on $\mathcal{Z}$. Clearly, $l(dx, dz) = l_1(dx)\,l_2(dz)$. From Tonelli's
Theorem (see Dunford and Schwartz [5], p. 194),

(5.3)  $l(C) = \int_{\mathcal{X}} l_2(C_x)\,l_1(dx)$.

Theorem 5.1. If $l(C) > 0$, then $Q(dz \mid x)$ is incoherent (strongly inconsistent) and
a Dutch book exists for the predictive distribution $Q(\cdot \mid x)$.

Proof. When $l(C) > 0$, the set

(5.4)  $D = \{x \mid l_2(C_x) > 0\} \subset \mathcal{X}$

must have positive $l_1$-measure. Consider the function

(5.5)  $\phi(x, z) = I_D(x)\,[I_{C_x}(z) - Q(C_x \mid x)]$.

We first claim that the function $\phi(x, z)$ is $G_T^+$-invariant. To see this, first note that
$C$ is an invariant set from the invariance of $q_H$ and the assumed invariance of $q$.
From this it follows easily that

$C_{gx} = gC_x, \quad g \in G_T^+$.

Therefore the set $D$ is invariant and $x \to Q(C_x \mid x)$ is an invariant function of $x$.
The invariance of $\phi$ in (5.5) is now immediate.

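We now verify (2.6) with $f = \phi$. First, for each $x$,

$\int \phi(x, z)\,Q(dz \mid x) = 0$.

Since $\phi$ is invariant and the model is invariant,

(5.6)  $\theta \to \iint \phi(x, z)\,P(dx, dz \mid \theta)$

is an invariant function of $\theta$. Because $G_T^+$ acts transitively on $\Theta = G_T^+$,
the right side of (5.6) is constant in $\theta$, say

$\epsilon_0 = \iint \phi(x, z)\,P(dx, dz \mid \theta)$ for all $\theta$.

Now, apply Theorem 4.1 to obtain

$\begin{aligned}
\epsilon_0 &= \iint \phi(x, z)\,P(dx, dz \mid \theta) \\
 &= \iint \phi(x, z)\,P_H(dx, dz \mid \theta) \\
 &= \iint I_D(x)\,[I_{C_x}(z) - Q(C_x \mid x)]\,Q_H(dz \mid x)\,P_1(dx \mid \theta) \\
 &= \int_D \int_{C_x} [q_H(z \mid x) - q(z \mid x)]\,l_2(dz)\,f_1(x \mid \theta)\,l_1(dx)
\end{aligned}$

where $f_1$ is given in (3.4). The set $D$ has positive $l_1$-measure, and the density $f_1$
is positive everywhere. Also, for all $x \in D$, $C_x$ has positive $l_2$ measure and for all
$z \in C_x$, $q_H(z \mid x) > q(z \mid x)$. Thus $\epsilon_0 > 0$ and (2.6) holds with
$\epsilon = \epsilon_0$. Hence SI holds.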
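
The last equality in the display above uses a short step that may be worth writing out. For fixed $x \in D$, integrating the bracketed term against $Q_H(dz \mid x)$ gives the difference of the two probabilities assigned to $C_x$, which is then expressed through the densities:

```latex
\int \bigl[ I_{C_x}(z) - Q(C_x \mid x) \bigr]\, Q_H(dz \mid x)
  = Q_H(C_x \mid x) - Q(C_x \mid x)
  = \int_{C_x} \bigl[ q_H(z \mid x) - q(z \mid x) \bigr]\, l_2(dz).
```

Integrating this over $x \in D$ against $f_1(x \mid \theta)\,l_1(dx)$ gives the final expression for $\epsilon_0$.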

To see that incoherence holds, take the net payoff function to be $\phi(x, z)$ (in (2.7),
take $r = 1$, $c_1(x) = I_D(x)$ and $C_1 = C$). A repeat of the argument above shows
(2.8) holds with $\epsilon = \epsilon_0$. This completes the proof. □

Before discussing any examples, it is useful to next consider the case when the 
set C has measure zero. 

Theorem 5.2. If $l(C) = 0$, then the set $D$ in (5.4) has $l_1$-measure zero and

(5.7)  $Q(\cdot \mid x) = Q_H(\cdot \mid x)$ a.e. ($l_1$).

Proof. When $l(C) = 0$, (5.3) shows $l_2(C_x) = 0$ a.e. ($l_1$). Hence $D$ has $l_1$-measure 0.
But for $x \in D^c$, $l_2(C_x) = 0$. For this $x$, a standard argument shows that
$Q(\cdot \mid x) = Q_H(\cdot \mid x)$. Therefore (5.7) holds and the proof is complete. □

We now proceed to give a wide class of examples where the set $C$ has positive
Lebesgue measure so Theorem 5.1 applies. To this end, let $k$ be a density function
on $R^p$ and as usual, for $x \in \mathcal{X}$ write

$xx' = LL', \quad L \in G_T^+$.

Observe that the predictive density $q_k(z \mid x)$ given by

(5.8)  $q_k(z \mid x) = |L|^{-1} k(L^{-1}z)$

is $G_T^+$-invariant and is obviously determined by $k$. Note that the predictive
distributions (2.2) and (2.3) both have predictive densities of the form (5.8) for an
appropriate $k$. Further, the Haar predictive distribution $Q_H$ derived in Section 3
has the form (5.8) with $k = k_1$ where $k_1$ is given in (3.21).

The next result shows that any predictive distribution of the form (5.8) is incoherent
when $k$ is different (on a set of positive $l_2$-measure) from the special density
$k_1$ that defines the Haar inference $Q_H$.

Theorem 5.3. Let $k$ be a density on $R^p$ and let $k_1$ be the density defined in (3.21).
Assume that

(5.9)  $\int |k(z) - k_1(z)|\,dz > 0$.

Then the predictive distribution $Q_k$ with predictive density (5.8) is incoherent.

Proof. For each $x \in \mathcal{X}$, the variation distance between $Q_H(\cdot \mid x)$ and
$Q_k(\cdot \mid x)$ is

(5.10)  $\begin{aligned}
\sup_{B \subset R^p} |Q_k(B \mid x) - Q_H(B \mid x)|
 &= \sup_{B \subset R^p} \left| \int_B [\,|L|^{-1}k(L^{-1}z) - |L|^{-1}k_1(L^{-1}z)\,]\,dz \right| \\
 &= \tfrac{1}{2} \int |k(z) - k_1(z)|\,dz.
\end{aligned}$

The second equality follows by a simple change of variable and the well known
identity involving variation distance (for example, see Billingsley [1], p. 224 for the
argument). Since the last expression in (5.10) is positive by assumption, we see that
$Q_k(\cdot \mid x) \ne Q_H(\cdot \mid x)$ for all $x$. Thus the set $D$ in (5.4) is $\mathcal{X}$ and
hence Theorem 5.1 applies. The conclusion follows. □
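
To get a feel for the size of the gambler's guaranteed gain, one can evaluate the variation distance in (5.10) numerically. The sketch below is an illustration under stated assumptions, not a result from the paper: it takes $p = 2$ and a hypothetical $n = 5$, uses $k = k_0$ (the Jeffreys proposal (2.3) written in the form (5.8)) against $k_1 = k_0/\psi_2$ with $\psi_2(w) = (1+w_1^2)/(1+w'w)^{1/2}$, and relies on the observation that for densities of the form (5.8) the inner integral $\int_{C_x}[q_H - q_k]\,dz$ in the proof of Theorem 5.1 equals $\frac{1}{2}\int |k - k_1|\,dw$ for every $x$, so that $\epsilon_0$ takes that same value. Function names are invented for the example.

```python
import math

N = 5  # hypothetical sample size; p = 2 throughout


def k0(w1, w2, n=N):
    """Assumed Jeffreys kernel for p = 2: C_{n,2} (1 + w'w)^(-(n+1)/2)."""
    c = math.gamma((n + 1) / 2) / (math.pi * math.gamma((n - 1) / 2))
    return c * (1.0 + w1 * w1 + w2 * w2) ** (-(n + 1) / 2)


def k1(w1, w2, n=N):
    """Assumed Haar kernel for p = 2: k_0 / psi_2."""
    psi2 = (1.0 + w1 * w1) / math.sqrt(1.0 + w1 * w1 + w2 * w2)
    return k0(w1, w2, n) / psi2


def integrate_plane(f, m=400):
    """Midpoint rule on R^2 after the substitution w_i = tan(t_i)."""
    h = math.pi / m
    nodes = [-math.pi / 2 + (j + 0.5) * h for j in range(m)]
    pts = [(math.tan(t), 1.0 / math.cos(t) ** 2) for t in nodes]  # (w, dw/dt)
    return h * h * sum(f(a, b) * ja * jb for a, ja in pts for b, jb in pts)


def expected_gain(n=N):
    """Half the L1 distance between k_0 and k_1, i.e. the variation distance."""
    return 0.5 * integrate_plane(lambda a, b: abs(k0(a, b, n) - k1(a, b, n)))


eps0 = expected_gain()
print(f"approximate eps0 for p = 2, n = {N}: {eps0:.4f}")
```

A strictly positive value here is the numerical counterpart of assumption (5.9) for the Jeffreys proposal in this small case.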



As an application of Theorem 5.3, by taking the mean to be zero in a MANOVA
model, and applying the results listed as items 2 through 7 in Table 1 of Keyes and
Levy [16], one obtains examples of predictive distributions which are incoherent.
Predictive distributions 3 through 7 are obtained via a formal Bayes calculation
with an improper prior distribution of the form

(5.11)  $\nu_\beta(d\Sigma) = |\Sigma|^{\beta}\,\dfrac{d\Sigma}{|\Sigma|^{(p+1)/2}}$,

where $\beta$ satisfies $\beta < (n - p + 1)/2$. The restriction on $\beta$ is necessary so the
formal Bayes calculation yields a proper posterior for our example. The improper prior
(5.11) yields a predictive distribution with a density of the form

(5.12)  $q_\beta(z \mid x) = \dfrac{C\,|L|^{-1}}{(1 + (L^{-1}z)'(L^{-1}z))^{(n+1-2\beta)/2}}$,

where $C$ is a constant. Theorem 5.3 implies that all such predictive distributions
are incoherent. The details are routine and left to the reader.